Hardware managed power collapse and clock wake-up for memory management units and distributed virtual memory networks

ABSTRACT

Methods and systems are disclosed for full-hardware management of power and clock domains related to a distributed virtual memory (DVM) network. An aspect includes transmitting, from a DVM initiator to a DVM network, a DVM operation, broadcasting, by the DVM network to a plurality of DVM targets, the DVM operation, and, based on the DVM operation being broadcasted to the plurality of DVM targets by the DVM network, performing one or more hardware optimizations comprising: turning on a clock domain coupled to the DVM network or a DVM target of the plurality of DVM targets that is a target of the DVM operation, increasing a frequency of the clock domain, turning on a power domain coupled to the DVM target based on the power domain being turned off, or terminating the DVM operation to the DVM target based on the DVM target being turned off.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application for patent is a continuation of U.S. patent application Ser. No. 15/086,054, entitled “HARDWARE MANAGED POWER COLLAPSE AND CLOCK WAKE-UP FOR MEMORY MANAGEMENT UNITS AND DISTRIBUTED VIRTUAL MEMORY NETWORKS,” filed Mar. 31, 2016, assigned to the assignee hereof, and expressly incorporated herein by reference in its entirety.

FIELD OF DISCLOSURE

Aspects of this disclosure relate to hardware managed power collapse and clock wake-ups for memory management units (MMUs) and distributed virtual memory (DVM) networks, and related concepts.

BACKGROUND

A “DVM network” is a broadcast network within the hardware/software architecture of a system-on-a-chip (SoC) designed to broadcast “DVM operations” from a “DVM initiator” to all “DVM targets” of the DVM network. The DVM network is responsible for merging responses from the DVM targets and presenting a single unified response back to the DVM initiator. DVM operations may include translation lookaside buffer (TLB) invalidate operations to TLBs located at a DVM target, synchronization operations to ensure completion of previous DVM operations, instruction cache invalidate operations to instruction caches located at a DVM target, and other related operations.

DVM networks use a protocol based on the Advanced Microcontroller Bus Architecture (AMBA) 4 Advanced Extensible Interface (AXI) Coherency Extensions (ACE) standard from ARM Ltd. AMBA 4 is an open-standard, on-chip interconnect specification for the connection and management of functional blocks in SoC designs. The standard specification only describes the “protocol” for DVM networks and does not mandate a specific implementation of the DVM network.

SUMMARY

The following presents a simplified summary relating to one or more aspects disclosed herein. As such, the following summary should not be considered an extensive overview relating to all contemplated aspects, nor should the following summary be regarded to identify key or critical elements relating to all contemplated aspects or to delineate the scope associated with any particular aspect. Accordingly, the following summary has the sole purpose to present certain concepts relating to one or more aspects relating to the mechanisms disclosed herein in a simplified form to precede the detailed description presented below.

A method for full-hardware management of power and clock domains related to a distributed virtual memory (DVM) network includes transmitting, from a DVM initiator to a DVM network, a DVM operation, broadcasting, by the DVM network to a plurality of DVM targets, the DVM operation, and, based on the DVM operation being broadcasted to the plurality of DVM targets by the DVM network, performing one or more hardware functions comprising: turning on a clock domain coupled to the DVM network or a DVM target of the plurality of DVM targets that is a target of the DVM operation, increasing a frequency of the clock domain coupled to the DVM network or the DVM target of the plurality of DVM targets that is the target of the DVM operation, turning on a power domain coupled to the DVM target of the plurality of DVM targets that is the target of the DVM operation based on the power domain being turned off, terminating the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation based on the DVM target being turned off, or any combination thereof.

An apparatus for full-hardware management of power and clock domains related to a DVM network includes a DVM initiator, a plurality of DVM targets, a DVM network coupled to the DVM initiator and the plurality of DVM targets, wherein the DVM network is configured to broadcast DVM operations from the DVM initiator to the plurality of DVM targets, wherein, based on a DVM operation in the DVM network being broadcasted to the plurality of DVM targets: a clock domain coupled to the DVM network or a DVM target of the plurality of DVM targets that is a target of the DVM operation is turned on, a frequency of the clock domain coupled to the DVM network or the DVM target of the plurality of DVM targets that is the target of the DVM operation is increased, a power domain coupled to the DVM target of the plurality of DVM targets that is the target of the DVM operation is turned on based on the power domain being turned off, the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation is terminated based on the DVM target being turned off, or any combination thereof.

An apparatus for full-hardware management of power and clock domains related to a DVM network includes means for transmitting, to a DVM network, a DVM operation, means for broadcasting, to a plurality of DVM targets, the DVM operation, and means for performing, based on the DVM operation being broadcasted to the plurality of DVM targets by the DVM network, one or more hardware functions comprising: turn on a clock domain coupled to the DVM network or a DVM target of the plurality of DVM targets that is a target of the DVM operation, increase a frequency of the clock domain coupled to the DVM network or the DVM target of the plurality of DVM targets that is the target of the DVM operation, turn on a power domain coupled to the DVM target of the plurality of DVM targets that is the target of the DVM operation based on the power domain being turned off, terminate the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation based on the DVM target being turned off, or any combination thereof.

Other objects and advantages associated with the aspects disclosed herein will be apparent to those skilled in the art based on the accompanying drawings and detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

A more complete appreciation of aspects of the disclosure and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings which are presented solely for illustration and not limitation of the disclosure, and in which:

FIG. 1 is a block diagram of an exemplary processor-based system that can include a plurality of system memory management units (SMMUs) according to at least one aspect of the disclosure.

FIG. 2 illustrates an exemplary system that includes a distributed virtual memory (DVM) initiator, a DVM network, and DVM targets according to at least one aspect of the disclosure.

FIG. 3A illustrates an exemplary TLB Invalidate by Virtual Address (TLBIVA) operation performed by the system of FIG. 2 according to at least one aspect of the disclosure.

FIG. 3B illustrates the system of FIG. 2 in which each of the DVM initiators, the DVM network, and the DVM targets are on separate clock and power domains according to at least one aspect of the disclosure.

FIG. 4 illustrates an exemplary system for full-hardware management of power and clock domains related to a DVM network and DVM targets according to at least one aspect of the disclosure.

FIG. 5 illustrates an exemplary flow for power collapsing a DVM target in the system of FIG. 4 according to at least one aspect of the disclosure.

FIG. 6 illustrates an exemplary flow for powering on a DVM target in the system of FIG. 4 according to at least one aspect of the disclosure.

FIG. 7A illustrates an exemplary flow for automatic clock wake-up in the system of FIG. 4 according to at least one aspect of the disclosure.

FIG. 7B illustrates an exemplary flow for automatic clock wake-up in the system of FIG. 4 according to at least one aspect of the disclosure.

FIG. 8 illustrates an exemplary flow for full-hardware management of power and clock domains related to a DVM network according to at least one aspect of the disclosure.

DETAILED DESCRIPTION

Methods and systems are disclosed for full-hardware management of power and clock domains related to a distributed virtual memory (DVM) network. An aspect includes transmitting, from a DVM initiator to a DVM network, a DVM operation, broadcasting, by the DVM network to a plurality of DVM targets, the DVM operation, and, based on the DVM operation being broadcasted to the plurality of DVM targets by the DVM network, performing one or more hardware functions comprising: turning on a clock domain coupled to the DVM network or a DVM target of the plurality of DVM targets that is a target of the DVM operation, increasing a frequency of the clock domain coupled to the DVM network or the DVM target of the plurality of DVM targets that is the target of the DVM operation, turning on a power domain coupled to the DVM target of the plurality of DVM targets that is the target of the DVM operation based on the power domain being turned off, or terminating the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation based on the DVM target being turned off.

These and other aspects of the disclosure are disclosed in the following description and related drawings directed to specific aspects of the disclosure. Alternate aspects may be devised without departing from the scope of the disclosure. Additionally, well-known elements of the disclosure will not be described in detail or will be omitted so as not to obscure the relevant details of the disclosure.

The words “exemplary” and/or “example” are used herein to mean “serving as an example, instance, or illustration.” Any aspect described herein as “exemplary” and/or “example” is not necessarily to be construed as preferred or advantageous over other aspects. Likewise, the term “aspects of the disclosure” does not require that all aspects of the disclosure include the discussed feature, advantage or mode of operation.

Further, various aspects are described in terms of sequences of actions to be performed by, for example, elements of a computing device. It will be recognized that various actions described herein can be performed by specific circuits (e.g., application specific integrated circuits (ASICs), systems-on-a-chip (SoCs)), by program instructions being executed by one or more processors, or by a combination of both. Additionally, these sequences of actions described herein can be considered to be embodied entirely within any form of computer-readable storage medium having stored therein a corresponding set of computer instructions that upon execution would cause an associated processor to perform the functionality described herein. Thus, the various aspects of the disclosure may be embodied in a number of different forms, all of which have been contemplated to be within the scope of the claimed subject matter. In addition, for each of the aspects described herein, the corresponding form of any such aspects may be described herein as, for example, “logic configured to” perform the described action.

Generally, unless stated otherwise explicitly, the phrase “logic configured to” as used throughout this disclosure is intended to invoke an aspect that is at least partially implemented with hardware, and is not intended to map to software-only implementations that are independent of hardware. Also, it will be appreciated that the configured logic or “logic configured to” in the various blocks are not limited to specific logic gates or elements, but generally refer to the ability to perform the functionality described herein (either via hardware or a combination of hardware and software). Thus, the configured logics or “logic configured to” as illustrated in the various blocks are not necessarily implemented as logic gates or logic elements despite sharing the word “logic.” Other interactions or cooperation between the logic in the various blocks will become clear to one of ordinary skill in the art from a review of the aspects described below in more detail.

Full-hardware management of power and clock domains related to a DVM network, according to aspects disclosed herein, may be provided in or integrated into any processor-based device. Examples, without limitation, include a set top box, an entertainment unit, a navigation device, a communications device, a fixed location data unit, a mobile location data unit, a mobile phone, a cellular phone, a server, a computer, a portable computer, a desktop computer, a personal digital assistant (PDA), a monitor, a computer monitor, a television, a tuner, a radio, a satellite radio, a music player, a digital music player, a portable music player, a digital video player, a video player, a digital video disc (DVD) player, a portable digital video player, etc.

In this regard, FIG. 1 illustrates an example of a processor-based system 100 according to at least one aspect of the disclosure. In this example, the processor-based system 100 includes one or more central processing units (CPUs) 102, each including one or more processors 104. The CPU(s) 102 may have cache memory 106 coupled to the processor(s) 104 for rapid access to temporarily stored data. The CPU(s) 102 further includes a CPU memory management unit (MMU) 108 for providing address translation services for CPU memory access requests. The CPU(s) 102 can communicate transaction requests to a memory controller 118 of a memory system 112, which provides memory units 114A-114N.

The CPU(s) 102 is coupled to a system bus 110 (which includes a DVM network (not shown)) that can intercouple master and slave devices included in the processor-based system 100. The CPU(s) 102 communicates with these other devices by exchanging address, control, and data information over the system bus 110. In the example of FIG. 1, an SMMU 116 is coupled to the system bus 110. Other master and slave devices can be connected to the system bus 110 via the SMMU 116. As illustrated in FIG. 1, these devices can include one or more input devices 120, one or more output devices 122, one or more network interface devices 124, and one or more display controllers 126, as examples. The input device(s) 120 can include any type of input device, including but not limited to input keys, switches, voice processors, etc. The output device(s) 122 can include any type of output device, including but not limited to audio, video, other visual indicators, etc. The network interface device(s) 124 can be any devices configured to allow exchange of data to and from a network 128. The network 128 can be any type of network, including but not limited to a wired or wireless network, a private or public network, a local area network (LAN), a wireless local area network (WLAN), the Internet, etc. The network interface device(s) 124 can be configured to support any type of communications protocol desired.

The CPU(s) 102 may also be configured to access the display controller(s) 126 over the system bus 110 to control information sent to one or more displays 130. The display controller(s) 126 sends information to the display(s) 130 to be displayed via one or more video processors 132, which process the information to be displayed into a format suitable for the display(s) 130. The display(s) 130 can include any type of display, including but not limited to a cathode ray tube (CRT), a liquid crystal display (LCD), a plasma display, etc.

The system bus 110 includes a DVM network that couples a DVM initiator (e.g., CPU(s) 102) to one or more DVM targets (e.g., SMMU 116). The DVM network (part of system bus 110) is included within the hardware/software architecture of a SoC to broadcast “DVM operations” from a “DVM initiator,” such as CPU 102, to all “DVM targets,” such as SMMU 116, of the DVM network.

FIG. 2 illustrates an exemplary system 200 that includes a DVM initiator 202, a DVM network 204, and DVM targets 206A to 206N according to at least one aspect of the disclosure. The DVM network 204 is responsible for merging responses from the DVM targets 206A to 206N and presenting a single unified response back to the DVM initiator 202. More specifically, the DVM network 204 waits for all of the responses from the DVM targets 206A to 206N, combines (or “merges”) them into a single response, and returns the single response to the DVM initiator 202. DVM operations may include translation lookaside buffer (TLB) invalidate operations to TLBs located at a DVM target, synchronization operations to ensure completion of previous DVM operations, instruction cache invalidate operations to instruction caches located at a DVM target, etc.

DVM networks, such as the DVM network 204, may use a protocol based on the AMBA 4 ACE standard from ARM Ltd. AMBA 4 is an open-standard, on-chip interconnect specification for the connection and management of functional blocks in SoC designs. The standard specification only describes the “protocol” for DVM networks and does not mandate a specific implementation of the DVM network. For example, clocking and power collapse and many other implementation details are beyond the scope of the standard specification.

In an aspect, the DVM targets 206A to 206N may be SMMUs. An SMMU is a DVM target comprising a TLB that receives DVM operations from the DVM network, such as the DVM network 204. When an SMMU receives a TLB invalidate operation over the DVM network 204, for example, the SMMU: 1) returns a TLB invalidate acknowledgement to the DVM network 204, and 2) performs the TLB invalidate on the TLB (and any cached translations). When an SMMU receives a sync operation over the DVM network 204, for example, the SMMU: 1) ensures that previously posted TLB invalidates are performed, and 2) ensures that client requests (e.g., read/write/etc.) that were using old/targeted TLB entries have been globally observed before returning a “sync complete.”
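The SMMU behavior described above can be illustrated with a minimal software sketch. It is not part of the disclosed hardware: the class name, method names, and response strings are hypothetical, and the sketch assumes that an acknowledged invalidate is "posted" and applied no later than the next sync operation.

```python
class SmmuModel:
    """Toy model of an SMMU acting as a DVM target (all names are illustrative)."""

    def __init__(self, tlb_entries):
        self.tlb = dict(tlb_entries)      # virtual address -> physical address
        self.posted_invalidates = []      # acknowledged but not yet performed
        self.inflight_requests = []       # client requests using cached entries

    def tlb_invalidate(self, virtual_address):
        # 1) Acknowledge right away; 2) the invalidate itself is posted.
        self.posted_invalidates.append(virtual_address)
        return "TLB_INVALIDATE_ACK"

    def sync(self):
        # 1) Perform every previously posted invalidate.
        for virtual_address in self.posted_invalidates:
            self.tlb.pop(virtual_address, None)
        self.posted_invalidates.clear()
        # 2) Drain client requests that may have used old entries
        #    (modeled here by simply emptying the list).
        self.inflight_requests.clear()
        return "SYNC_COMPLETE"
```

In this sketch, the sync response is what tells the initiator that earlier invalidates have actually taken effect; the acknowledgement alone only confirms receipt.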

FIG. 3A illustrates an exemplary “TLB Invalidate by Virtual Address” (TLBIVA) operation performed by the system 200 of FIG. 2 according to at least one aspect of the disclosure. The DVM initiator 202 issues a TLBIVA operation (represented as block 310) over the DVM network 204. The DVM network 204 broadcasts the TLBIVA operation (represented as block 312) to all DVM targets, i.e., DVM targets 206A to 206N. Each DVM target 206A to 206N acknowledges receipt of the TLBIVA operation and provides an acknowledgement response (represented as block 314) to the DVM network 204. The DVM network 204 merges all acknowledgment responses from the DVM targets and presents a unified receipt response (represented as block 316) back to the DVM initiator 202.
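The broadcast-and-merge sequence of blocks 310 to 316 can be sketched in software as follows. This is only an illustration under stated assumptions: the class names and the "ACK"/"ERROR" strings are hypothetical, and the merge rule (unified acknowledgement only if every target acknowledged) is one plausible choice that the disclosure does not specify.

```python
class DvmTargetModel:
    """Hypothetical DVM target that acknowledges every operation it receives."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def receive(self, operation):
        self.received.append(operation)
        return "ACK"                      # block 314: per-target acknowledgement


class DvmNetworkModel:
    """Hypothetical DVM network that broadcasts and merges responses."""

    def __init__(self, targets):
        self.targets = list(targets)

    def broadcast(self, operation):
        # Block 312: send the operation to every DVM target.
        responses = [target.receive(operation) for target in self.targets]
        # Block 316: wait for all responses and merge them into one.
        return "ACK" if all(r == "ACK" for r in responses) else "ERROR"
```

The initiator sees exactly one response per operation, regardless of how many targets sit behind the network.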

For power optimization and/or performance reasons, it is sometimes desirable to have SMMUs/DVM targets on a separate clock domain and a separate power domain. FIG. 3B illustrates the system 200 of FIG. 2 in which each of the DVM initiator 202, the DVM network 204, and the DVM targets 206A to 206N are on separate clock and power domains according to at least one aspect of the disclosure. In the example of FIG. 3B, the DVM initiator 202 is on its own clock and power domain 322, the DVM network 204 is on its own clock and power domain 324, the DVM targets 206A and 206B are on their own clock and power domain 326, and the DVM target 206N is on its own clock and power domain 328. The DVM targets 206A to 206N may each be on a separate clock and power domain, or all of the DVM targets 206A to 206N may be on the same clock and power domain, or different groups of the DVM targets 206A to 206N may be on different clock and power domains (as illustrated in FIG. 3B).

The introduction of multiple clock and multiple power domains within the DVM network 204 can place a burden on software if the clock domains and power domains are software managed/controlled. In such cases, the clocks must be software managed when a TLB invalidate is issued from the CPU (such as the CPU(s) 102 in FIG. 1), and/or the DVM initiator 202 and the DVM network 204 must be software managed when a DVM target of the DVM targets 206A to 206N is power collapsed (i.e., powered off).

Having the DVM network 204 and all of the DVM targets 206A to 206N on a single clock and power domain simplifies the problem. However, this may lead to undesirable latency performance profiles if, for example, the DVM network 204 and the DVM targets 206A to 206N are forced to operate on a single fast/slow clock. It may also lead to undesirable power performance profiles if, for example, the DVM network 204 and the DVM targets 206A to 206N are “always powered.”

Accordingly, the present disclosure presents a mechanism for full-hardware management of the power and clock domains relating to a DVM network, such as the DVM network 204, and the DVM targets, such as the DVM targets 206A to 206N. In an aspect, the disclosed hardware mechanism can 1) turn on the relevant clocks based on the presence of a DVM operation in the DVM network 204 (and then, when the operation is done, the relevant clocks are turned back off), 2) speed up the relevant clocks based on the presence of a DVM operation in the DVM network 204 (and then, when the operation is done, the relevant clocks are slowed back down), and/or 3) automatically terminate DVM operations that are broadcast to the DVM targets 206A to 206N that have power collapsed, as appropriate. Optionally, a DVM target 206A to 206N that is power collapsed can be “powered-up” based on the presence of DVM operations in the DVM network 204.

Points 1 and 2 above ensure low latency (due to the high performance DVM network response). The impact of the present disclosure includes releasing the software from the burden of having to software manage the DVM network 204 prior to a DVM target 206A to 206N being powered off, and releasing the software from the burden of having to software manage clocks prior to the DVM initiator 202 issuing a DVM operation (e.g., a TLB invalidate).

FIG. 4 illustrates an exemplary system 400 for full-hardware management of power and clock domains related to a DVM network and DVM targets according to at least one aspect of the disclosure. The system 400 includes a CPU subsystem 402, which may correspond to the CPU 102 in FIG. 1, that acts as a DVM initiator. The CPU subsystem/DVM initiator 402 issues commands/DVM operations to a DVM network 404, which may be part of the system bus 110 in FIG. 1, via a DVM master port 412 at the CPU subsystem/DVM initiator 402 and a DVM slave port 414 at the DVM network 404. The DVM network 404 broadcasts the commands/DVM operations to the DVM targets, such as DVM target 406 (e.g., an SMMU), which may correspond to SMMU 116 in FIG. 1.

A DVM interceptor 428 ensures that no DVM operations pass through to the DVM targets unless all downstream target DVM clocks are turned on. The DVM interceptor 428 includes logic to stop any DVM operations until the relevant clocks are turned on. The DVM interceptor 428 communicates with a clock manager 410, which is responsible for turning on any clocks related to the DVM operations that are turned off.
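The gating behavior of the DVM interceptor 428 can be pictured with a small software model. The sketch below is illustrative only: the class and method names are hypothetical, and the clock turn-on latency is modeled by an explicit `settle()` call rather than by real timing.

```python
from collections import deque


class ClockManagerModel:
    """Hypothetical clock manager: a requested clock turns on when settle() runs."""

    def __init__(self):
        self.clock_on = False
        self.requested = False

    def request_clock_on(self):
        self.requested = True

    def settle(self):
        # Models the delay between the request and the clock actually running.
        if self.requested:
            self.clock_on = True


class DvmInterceptorModel:
    """Holds DVM operations until the downstream clock is on."""

    def __init__(self, clock_manager):
        self.clock_manager = clock_manager
        self.held = deque()       # operations stopped at the interceptor
        self.forwarded = []       # operations passed to the DVM targets

    def submit(self, operation):
        self.held.append(operation)
        if not self.clock_manager.clock_on:
            self.clock_manager.request_clock_on()
        self.release()

    def release(self):
        # Forward held operations, in order, only once the clock is on.
        while self.clock_manager.clock_on and self.held:
            self.forwarded.append(self.held.popleft())
```

Holding operations in a queue preserves their order, so a sync operation cannot overtake the invalidates it is meant to follow.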

When a DVM target, such as the DVM target 406, is power collapsed, a DVM disconnect module 426 communicates with a power collapse manager 420 to ensure the proper shutdown of the DVM target 406. The power collapse manager 420 communicates with the DVM target 406 to ensure the proper shutdown/power collapse of the DVM target 406. The power collapse manager 420, via the DVM disconnect module 426, communicates with the DVM network 404 to ensure that the DVM network 404 provides the proper response to the DVM initiator, i.e., the CPU subsystem/DVM initiator 402. The power collapse manager 420 also reads “Power Off Requests” from and writes “Power Off Status” to the registers for the power collapse interface 440.

A clock bridge 424 is an interconnection device that allows communication (DVM communication in this case) between two separate clock domains. For example, here, the DVM network 404 is on one clock domain while the DVM target 406 is on a separate clock domain, thus requiring a “clock bridge” to bridge the two clock domains.

The CPU subsystem/DVM initiator 402 may issue a dynamic clock divide (DCD) wakeup command to clock selectors 432A and 432B. The clock selectors 432A and 432B select the fastest clock when there is DVM activity by causing the clock dividers 434A and 434B to be bypassed. More specifically, when the DCD wakeup command is “1,” the clock selectors 432A and 432B cause the multiplexors coupled to the clock dividers 434A and 434B to select the undivided clock signal and send it to the clock manager 410. This causes the corresponding clock circuitry to speed up the clock.

The clock manager 410 may also receive votes to keep a given clock on, represented in FIG. 4 as SoftwareClockONRequest(s). As long as there is at least one vote to keep a clock on, that clock will remain on.
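The divider bypass and the clock-on voting described above can be illustrated with a short sketch. The function and class names, the voter labels, and the example frequencies used with it are hypothetical and are not taken from the disclosure.

```python
def selected_frequency_hz(source_hz, divider, dcd_wakeup):
    """Model of a clock selector: bypass the divider when DCD wakeup is 1."""
    return source_hz if dcd_wakeup else source_hz // divider


class ClockVotes:
    """Model of clock-on voting: the clock stays on while any vote remains."""

    def __init__(self):
        self.voters = set()

    def vote_on(self, voter):
        self.voters.add(voter)

    def remove_vote(self, voter):
        self.voters.discard(voter)

    @property
    def clock_enabled(self):
        return bool(self.voters)
```

For example, with an assumed 800 MHz source and a divide-by-4 divider, the selector would return 200 MHz when idle and 800 MHz while the DCD wakeup command is asserted. Voting means that hardware-driven wake-ups and software requests can coexist without either one switching off a clock the other still needs.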

FIG. 5 illustrates an exemplary flow for power collapsing (i.e., powering off) a DVM target, such as the DVM target 406 (e.g., an SMMU), in the system 400 of FIG. 4 according to at least one aspect of the disclosure.

At 502, the power collapse manager 420 receives a request from the software being currently executed to power collapse the DVM target 406. The software asserts the request by writing a “Power Collapse Request” in the registers for the power collapse interface 440 in FIG. 4. Alternatively, instead of the software triggering a power-off of the DVM target 406, a signal from the CPU subsystem/DVM initiator 402 in FIG. 4 or the DVM network 404 may trigger the power-off sequence. This signal would indicate that no pending DVM requests in the DVM network 404 are permitted to trigger a power-collapse event when there is no other activity that causes the DVM target 406 to be powered. The power collapse manager 420 would receive this signal and use it as a means to determine when to power down the DVM target 406. The TLB contents of the DVM target 406 would be “retained” even when the main power is “off” by way of retention circuits, or a secondary storage unit.

At 504, the power collapse manager 420 issues a DVMDisconnectRequest message to the DVM network 404 to safely disconnect the DVM target 406 from the DVM network 404. At 506, once the DVM network 404 receives the DVMDisconnectRequest message, the DVM network 404 safely terminates any new DVM operations, such that new DVM operations do not reach the DVM target 406. Terminating the DVM operations ensures that any new DVM operation is acknowledged/completed and that the DVM initiator (e.g., the CPU subsystem 402) receives a valid non-error response indicating that the terminated transaction was acknowledged/completed “normally” (i.e., without error).

At 508, the DVM network 404 ensures that all previously pending DVM operations are acknowledged or completed by the DVM target 406. At 510, the DVM network 404 returns a DVMDisconnectReady message once all pending DVM operations are acknowledged or completed by the DVM target 406.

At 512, the power collapse manager 420 receives the DVMDisconnectReady message from the DVM network 404. At 514, the power collapse manager 420 issues a SMMUPowerCollapseRequest message to the DVM target 406.

At 516, once the DVM target 406 receives the power collapse request, the DVM target 406 blocks any new client requests (e.g., DVM operations from the CPU subsystem/DVM initiator 402). At 518, the DVM target 406 waits until any pending activity is completed (e.g., all outstanding client requests are completed, and all outstanding translation table walks are completed). At 520, once all pending activity in the DVM target 406 is completed, the DVM target 406 returns a SMMUPowerCollapseReady message.

At 522, the power collapse manager 420 issues a PowerONControl=0 message to a power controller 408. Once the power controller 408 receives the PowerONControl=0 message, it removes the power from the DVM target 406 (thereby power collapsing the DVM target 406) and associated components.

At 524, the power collapse manager 420 returns a power status signal indicating that the power has been removed from the DVM target 406. This status is readable by the software via the registers for the power collapse interface 440.
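The ordering of the power collapse flow of FIG. 5 can be summarized in a minimal software sketch. It is a simplification under stated assumptions: the class, function, and "PowerOffStatus" names and the "OK" response string are hypothetical, and the handshakes are collapsed into sequential steps instead of concurrent hardware signals.

```python
class DvmTargetLink:
    """Hypothetical model of the DVM network's connection to one DVM target."""

    def __init__(self):
        self.disconnected = False
        self.pending = []          # forwarded operations not yet completed
        self.delivered = []        # operations the target actually completed
        self.target_powered = True

    def send(self, operation):
        if self.disconnected:
            # 506: terminated in the network with a normal, non-error response.
            return "OK"
        self.delivered.append(operation)
        return "OK"


def power_collapse(link):
    """Walk the FIG. 5 steps in order and return the messages exchanged."""
    steps = ["DVMDisconnectRequest"]            # 504
    link.disconnected = True                    # 506: new operations terminated
    link.delivered.extend(link.pending)         # 508: drain pending operations
    link.pending.clear()
    steps.append("DVMDisconnectReady")          # 510, 512
    steps.append("SMMUPowerCollapseRequest")    # 514
    steps.append("SMMUPowerCollapseReady")      # 516 to 520: target quiesced
    steps.append("PowerONControl=0")            # 522
    link.target_powered = False
    steps.append("PowerOffStatus")              # 524: status readable by software
    return steps
```

The key property in this sketch is that the initiator receives the same normal response before and after the collapse, so it never has to know whether a given target is powered.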

FIG. 6 illustrates an exemplary flow for powering on a DVM target, such as the DVM target 406 (e.g., an SMMU), in the system 400 of FIG. 4 according to at least one aspect of the disclosure.

At 602, the power collapse manager 420 receives a request from the software currently being executed to power ON the DVM target 406. The software asserts the request by de-asserting the “Power Collapse Request” in the registers for the power collapse interface 440. Alternatively, instead of the software triggering a power-ON of the DVM target 406, a handshake from the DVM network 404 may trigger the power-ON sequence. This handshake would be performed if a DVM operation is targeting the DVM target 406. The power collapse manager 420 would receive this power-ON request from the DVM network 404 and complete the handshake when the DVM target 406 is powered on.

At 604, the power collapse manager 420 issues a PowerONControl=1 message to the power controller 408. At 606, the power collapse manager 420 waits until the DVM target 406 is fully powered on. More specifically, the power collapse manager 420 waits for the power-ON status indicator from the power controller 408. Once the power controller 408 receives the PowerONControl=1 message, the DVM target 406 will be powered on. Once power is restored to the DVM target 406, the power controller 408 issues a reset message to the DVM target 406, and also issues a TLB reset message to the DVM target 406 to ensure that the TLB is “invalid” and contains no valid information. Note, this operation may not be performed for the alternative described above with reference to 602, since the TLB would contain valid information.

At 608, the power collapse manager 420 asserts a power-ON request to the DVM target 406 by de-asserting the SMMUPowerCollapseRequest message for the DVM target 406.

At 610, once the DVM target 406 receives the power-ON request, the DVM target 406 unblocks any client requests (e.g., DVM operations from the CPU subsystem/DVM initiator 402). At 612, the DVM target 406 returns a SMMUPowerCollapseReady=1 message to acknowledge the power-ON request and to indicate that it is ready for a subsequent power collapse.

At 614, the power collapse manager 420 asserts the power-ON request to the DVM network 404 by de-asserting the DVMDisconnectRequest message to the DVM network 404 to reconnect the DVM target 406 to the DVM network 404.

At 616, once the DVM network 404 receives the reconnect request from the power collapse manager 420 (i.e., the DVMDisconnectRequest=0 message), the DVM network 404 stops terminating any new DVM operations and forwards them (as normal) to the DVM target 406. At 618, the DVM network 404 returns an acknowledgement of the power-ON request to the power collapse manager 420.

At 620, the power collapse manager 420 waits for an acknowledgement from the DVM network 404. At 622, the power collapse manager 420 returns a power status signal indicating that the power has been applied to the DVM target 406. This status is readable by the software via the registers for the power collapse interface 440.
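The power-ON flow of FIG. 6 mirrors the power collapse flow in reverse order, and can be sketched as below. The function name, the dictionary keys, and the "TLBReset" and "PowerOnStatus" labels are hypothetical; the `retained_tlb` flag models the alternative in which TLB contents are retained and the TLB reset is skipped.

```python
def power_on(state, retained_tlb=False):
    """Walk the FIG. 6 steps in order; `state` is a plain dict describing the target."""
    steps = ["PowerONControl=1"]                 # 604
    state["powered"] = True                      # 606: wait for power-ON status
    if not retained_tlb:
        state["tlb"] = {}                        # TLB reset: no valid entries
        steps.append("TLBReset")
    steps.append("SMMUPowerCollapseRequest=0")   # 608
    state["blocking_clients"] = False            # 610: unblock client requests
    steps.append("SMMUPowerCollapseReady=1")     # 612
    steps.append("DVMDisconnectRequest=0")       # 614
    state["disconnected"] = False                # 616: forwarding resumes
    steps.append("PowerOnStatus")                # 618 to 622
    return steps
```

The target is reconnected to the DVM network only after it is powered, reset, and accepting client requests, so no broadcast operation can reach a target that is not ready for it.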

FIG. 7A illustrates an exemplary flow for automatic clock wake-up in the system 400 of FIG. 4 according to at least one aspect of the disclosure. At 702, the DVM initiator 402 broadcasts a DVM operation on the DVM network 404. The present disclosure includes software programmed provisions to exclude a DVM target, such as the DVM target 406, from receiving DVM operations. Accordingly, not all DVM targets will receive the “broadcasted” DVM operation.

At 704, the DVM initiator 402 asserts a DCDWakeUpRequest signal as an “early” indication that there is a pending DVM operation. Note, the DCDWakeUpRequest signal is an “early” indication of a pending DVM request since it is asserted long before the DVM operation reaches the DVM target 406. At 706, the clock selectors 432A and/or 432B receive the DCDWakeUpRequest signal and respond by switching to a faster clock frequency source and/or selecting the non-divided clock signal to send to the clock manager 410. The clock manager 410 uses these faster/non-divided clocks as the clock source for the SMMU/DVM target. By using the “faster” non-divided clock, the DVM network 404 and DVM targets are able to respond faster to the DVM operations that are broadcast over the DVM network 404.

At 708, the clock manager 410, the DVM network 404, and the DVM targets 406 use the faster clocks. At 710, the clock manager 410, the DVM network 404, and the DVM targets 406 receive and perform the DVM operation broadcasted at 702.

At 712, the DVM initiator 402 waits for responses from the DVM network 404. At 714, the DVM initiator 402 determines whether or not there are any new DVM operations. If there are, the flow returns to 702. If not, the flow proceeds to 716. At 716, the DVM initiator 402 de-asserts the DCDWakeUpRequest. When the DCDWakeUpRequest signal is de-asserted, the DVM-related clocks can be switched back to the divided clocks for power savings.
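The effect of the DCDWakeUpRequest signal on the clock selection in the flow of FIG. 7A can be sketched as follows. This Python model is a simplified illustration under stated assumptions: `ClockSelector`, `run_initiator`, the source frequency, and the divider value are hypothetical names and numbers chosen for the example, and responses are treated as arriving immediately.

```python
class ClockSelector:
    """Selects between a non-divided and a divided clock (hypothetical model)."""

    def __init__(self, source_hz: int, divider: int):
        self.source_hz = source_hz
        self.divider = divider

    def output_hz(self, dcd_wakeup_request: bool) -> int:
        # 706: while the wake-up request is asserted, pass the non-divided
        # clock; otherwise pass the divided clock to save power.
        if dcd_wakeup_request:
            return self.source_hz
        return self.source_hz // self.divider


def run_initiator(ops, selector):
    """Walk 702-716 and record the clock each DVM operation runs at."""
    pending = list(ops)
    wakeup = False
    seen = []
    while pending:                 # 714: are there new DVM operations?
        op = pending.pop(0)        # 702: broadcast the DVM operation
        wakeup = True              # 704: assert DCDWakeUpRequest early
        seen.append((op, selector.output_hz(wakeup)))  # 706-710
        # 712: wait for responses (immediate in this model)
    wakeup = False                 # 716: de-assert DCDWakeUpRequest
    return seen, selector.output_hz(wakeup)
```

For example, with an assumed 800 MHz source and a divide-by-4 setting, every operation in a back-to-back burst runs on the 800 MHz clock, and the selector output returns to 200 MHz only after the last operation completes and the request is de-asserted.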

FIG. 7B illustrates an exemplary flow for automatic clock wake-up in the system 400 of FIG. 4 according to at least one aspect of the disclosure. At 722, a DVM operation is broadcasted on the DVM network 404 in FIG. 4, as at 702 of FIG. 7A.

At 724, the CPU subsystem/DVM initiator 402 asserts a DCDWakeUpRequest signal as an “early” indication that there is a pending DVM operation, as at 704 of FIG. 7A. Note, the DCDWakeUpRequest signal is an “early” indication of a pending DVM request since it is asserted long before the DVM operation reaches the DVM target 406. The clock selectors 432A and/or 432B receive the DCDWakeUpRequest signal and respond by selecting the non-divided clock to send to the clock manager 410. The clock manager 410 uses these non-divided clocks as the clock source for the SMMU/DVM target. By using the “faster” non-divided clock, the DVM network 404 and DVM targets are able to respond faster to the DVM operations that are broadcast over the DVM network 404.

At 726, another DVM operation is broadcast over the DVM network 404 to all (or some) of the DVM targets. The present disclosure includes software-programmed provisions to exclude a DVM target, such as the DVM target 406, from receiving DVM operations. Accordingly, not all DVM targets will receive the “broadcasted” DVM operation.

At 728, the DVM interceptor 428 “intercepts” the DVM operation inside the DVM master port 412 of the DVM network 404. At 730, the DVM interceptor 428 blocks the DVM operation until the DVM targets' clocks are ON (referred to as “toggling”). At 732, the DVM interceptor 428 issues a DVMSMMUClockONRequest to the clock manager 410.

At 734, the clock manager 410 will ensure that the clock gating elements are disabled and that the clocks relating to the DVM network components and DVM targets are ON. At 736, once the clocks relating to DVM network components and DVM targets are ON, the clock manager 410 returns a DVMSMMUClockONReady response to the DVM interceptor 428.

At 738, the DVM interceptor 428 waits until the clock manager 410 returns the DVMSMMUClockONReady response. At 740, the DVM interceptor 428 “unblocks” and allows the DVM operation to proceed to the DVM target 406 (assuming it is not in the process of being power collapsed). The DVM target 406 asserts the “SMMUIsActive” signal for as long as the DVM target 406 is actively processing the DVM operation (or any other operation). When all of the DVM responses have returned and the DCDWakeUpRequest signal is de-asserted, the DVM-related clocks can be switched back to the divided clocks for power savings.

At 742, when all DVM responses have returned and exited the DVM network 404, the DVM interceptor 428 eventually stops requesting that the clocks be turned ON by de-asserting the request signal DVMClockONRequest. This can be done, for example, when there are no DVM requests pending at the DVM interceptor 428 and when an amount of time (for example, a fixed number of clock cycles) has elapsed since the last DVM request was pending at the DVM interceptor 428. At this time, at 744, the clock manager 410 may decide to shut off the clocks (referred to as “no toggle”) if no other agent will use those clocks.
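The block, request, and release behavior of the DVM interceptor at 728 through 744 can be modeled in a few lines. The Python sketch below is an assumption-laden illustration rather than the disclosed circuit: the classes `ClockManager` and `DvmInterceptor`, the `idle_cycles` parameter, and the `other_agents_active` flag are hypothetical, and the clock-ON handshake is treated as completing within a single call.

```python
class ClockManager:
    """Hypothetical clock manager: turns clocks ON on request."""

    def __init__(self, other_agents_active: bool = False):
        self.clocks_on = False
        self.other_agents_active = other_agents_active

    def request_on(self) -> bool:
        # 734/736: un-gate the clocks and report ready.
        self.clocks_on = True
        return True

    def request_released(self) -> None:
        # 744: shut the clocks off only if no other agent uses them.
        if not self.other_agents_active:
            self.clocks_on = False


class DvmInterceptor:
    """Hypothetical interceptor that holds operations until clocks are ON."""

    def __init__(self, clock_manager: ClockManager, idle_cycles: int):
        self.cm = clock_manager
        self.idle_cycles = idle_cycles
        self.clock_on_request = False
        self.in_flight = 0
        self.idle_count = 0
        self.delivered = []

    def submit(self, op) -> None:
        blocked = op                         # 728/730: intercept and block
        self.clock_on_request = True         # 732: assert the clock ON request
        if not self.cm.request_on():         # 738: wait for the ready response
            raise RuntimeError("clock manager did not report ready")
        self.delivered.append(blocked)       # 740: unblock toward the target
        self.in_flight += 1
        self.idle_count = 0

    def response_returned(self) -> None:
        self.in_flight -= 1

    def tick(self) -> None:
        # Count idle cycles only while nothing is pending.
        if self.in_flight > 0 or not self.clock_on_request:
            return
        self.idle_count += 1
        if self.idle_count >= self.idle_cycles:
            self.clock_on_request = False    # 742: stop requesting clocks
            self.cm.request_released()
```

The idle counter is one possible way to realize the "fixed number of clock cycles" condition mentioned above; it avoids toggling the clocks off and on again between closely spaced DVM operations.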

FIG. 8 illustrates an exemplary flow for full-hardware management of power and clock domains related to a DVM network according to at least one aspect of the disclosure. The flow illustrated in FIG. 8 may be performed by the system 400 in FIG. 4.

At 802, a DVM initiator, such as the CPU subsystem/DVM initiator 402, transmits a DVM operation to a DVM network, such as the DVM network 404.

At 804, the DVM network, such as the DVM network 404, broadcasts the DVM operation to a plurality of DVM targets, such as the DVM target 406.

At 806, based on the pending DVM operation being broadcasted to the plurality of DVM targets by the DVM network, one or more hardware functions are performed.

For example, at 812, a clock domain coupled to the DVM network (e.g., clock domain 324) or a DVM target (e.g., clock domain(s) 326/328) of the plurality of DVM targets that is a target of the DVM operation may be turned on.

Alternatively or additionally, at 814, a frequency of the clock domain coupled to the DVM network (e.g., clock domain 324) or the DVM target (e.g., clock domain(s) 326/328) of the plurality of DVM targets that is the target of the DVM operation may be increased.

Alternatively or additionally, at 816, a power domain coupled to a DVM target of the plurality of DVM targets that is the target of the DVM operation is turned on based on the power domain being turned off.

Alternatively or additionally, at 818, the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation is terminated based on the DVM target being turned off. More specifically, if the DVM target of the plurality of DVM targets that is the target of the DVM operation is turned off, the DVM operation is terminated.
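The hardware functions at 812 through 818 can be pictured together in a single broadcast routine. The following Python sketch is a behavioral illustration only: the `Target` class, the `broadcast` function, the `auto_power_on` option, the frequency values, and the string status codes are assumptions introduced for the example and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Hypothetical DVM target state."""
    name: str
    powered: bool = True
    clock_on: bool = False
    clock_hz: int = 100
    received: int = 0


def broadcast(op, targets, fast_hz=400, auto_power_on=False):
    """Broadcast one DVM operation and return a single merged response."""
    responses = []
    for t in targets:
        if not t.powered:
            if auto_power_on:
                t.powered = True                       # 816: turn power on
            else:
                # 818: terminate the operation; the network responds
                # on behalf of the powered-off target.
                responses.append((t.name, "terminated"))
                continue
        t.clock_on = True                              # 812: turn clock on
        t.clock_hz = max(t.clock_hz, fast_hz)          # 814: raise frequency
        t.received += 1
        responses.append((t.name, "done"))
    # The network merges per-target responses into one unified response.
    merged = "OK" if len(responses) == len(targets) else "ERROR"
    return merged, responses
```

In this model the initiator always receives one merged response, whether a given target processed the operation or the network terminated the operation on its behalf, so the initiator needs no knowledge of which targets are powered.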

There are a number of benefits of hardware-managed power collapse, as disclosed herein. For example, the power management software can freely power collapse DVM targets without having to synchronize/coordinate with software that may be using the DVM networks. In-flight DVM operations still complete “successfully,” even when targeting a DVM target with no power. Further, in some aspects, the issuance of a DVM operation will optionally power-ON the DVM targets without explicit instruction to manage power from the software.

There are also a number of benefits of hardware-managed clocking, as disclosed herein. For example, the issuance of a TLB invalidate instruction will turn-ON the associated clocks on the DVM network and DVM targets without explicit instruction to manage the clocks from the software. Further, the issuance of a TLB invalidate instruction (for example) will speed-up the associated clocks on the DVM network and DVM targets without explicit instruction to manage the clocks from the software. The result is a faster DVM network that does not rely on software management.

Other aspects of the disclosure include provisions to optionally and programmatically exclude a DVM target from participating in the DVM network. The programmability of said controls is software readable/writable from “privileged” or “secure” software. Other aspects include a provision/facility to automatically switch the clock source to an “always present” fast clock when the phase-locked loops (PLLs) are disabled.
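One way to picture the programmable exclusion of DVM targets is a per-target participation mask that only privileged software may modify. The Python sketch below is purely illustrative: the class `DvmParticipationRegister`, the one-bit-per-target layout, and the use of `PermissionError` to model a rejected access are assumptions, not details taken from the disclosure.

```python
class DvmParticipationRegister:
    """Hypothetical mask register: bit i set means target i participates."""

    def __init__(self, num_targets: int):
        self.num_targets = num_targets
        self.mask = (1 << num_targets) - 1   # all targets participate

    def write(self, value: int, privileged: bool) -> None:
        # Only "privileged" or "secure" software may change the mask.
        if not privileged:
            raise PermissionError("register is writable by privileged software only")
        self.mask = value & ((1 << self.num_targets) - 1)

    def recipients(self) -> list:
        # Indices of the targets that receive a broadcasted DVM operation.
        return [i for i in range(self.num_targets) if (self.mask >> i) & 1]
```

With such a mask, a "broadcast" reaches only the targets whose bits are set, which is consistent with the statement that not all DVM targets will receive a broadcasted DVM operation.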

Those of skill in the art will appreciate that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.

Further, those of skill in the art will appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the aspects disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.

The various illustrative logical blocks, modules, and circuits described in connection with the aspects disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.

The methods, sequences and/or algorithms described in connection with the aspects disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal (e.g., UE). In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.

In one or more exemplary aspects, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

While the foregoing disclosure shows illustrative aspects of the disclosure, it should be noted that various changes and modifications could be made herein without departing from the scope of the disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the aspects of the disclosure described herein need not be performed in any particular order. Furthermore, although elements of the disclosure may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.

What is claimed is:
 1. A method for full-hardware management of power and clock domains related to a distributed virtual memory (DVM) network, comprising: transmitting, from a DVM initiator of a processor-based system of a device to the DVM network, a DVM operation; broadcasting, by the DVM network to each of a plurality of DVM targets physically coupled to the processor-based system of the device, the same DVM operation, wherein the plurality of DVM targets comprises a plurality of memory management units, wherein the DVM network is included in a system bus of the processor-based system of the device between the DVM initiator and the plurality of DVM targets, and wherein the DVM network combines responses to the DVM operation received from the plurality of DVM targets into a single response for the DVM initiator; and based on the DVM operation being broadcasted to the plurality of DVM targets by the DVM network, performing one or more hardware functions comprising: turning on a clock domain coupled to the DVM network or a DVM target of the plurality of DVM targets that is a target of the DVM operation, increasing a frequency of the clock domain coupled to the DVM network or the DVM target of the plurality of DVM targets that is the target of the DVM operation, terminating the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation based on the DVM target being turned off, or any combination thereof.
 2. The method of claim 1, wherein turning on the clock domain coupled to the DVM target of the plurality of DVM targets that is the target of the DVM operation comprises: asserting, by the DVM initiator, a wakeup request to the clock domain coupled to the DVM target; blocking, by the DVM network, the DVM operation while the clock domain coupled to the DVM target is turned off; turning on the clock domain coupled to the DVM target; and based on the clock domain coupled to the DVM target being turned on, unblocking the DVM operation and transmitting, by the DVM network, the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation.
 3. The method of claim 1, wherein increasing the frequency of the clock domain coupled to the DVM target of the plurality of DVM targets that is the target of the DVM operation comprises: asserting, by the DVM initiator, a wakeup request to the clock domain coupled to the DVM target; selecting, by a clock selector unit coupled to the DVM initiator, a non-divided version of the clock domain coupled to the DVM target; and based on the non-divided version of the clock domain coupled to the DVM target being selected, transmitting, by the DVM network, the DVM operation to the DVM target that is the target of the DVM operation.
 4. The method of claim 1, wherein terminating the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation based on the DVM target being turned off comprises: receiving, from a DVM disconnect module, a command to terminate subsequent DVM operations.
 5. The method of claim 1, wherein terminating the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation based on the DVM target being turned off comprises: terminating, by the DVM network, the DVM operation; and generating, by the DVM network, a response to the DVM operation.
 6. The method of claim 1, wherein the DVM initiator comprises a processor.
 7. The method of claim 1, wherein the DVM operation comprises a translation lookaside buffer (TLB) invalidate operation, a synchronization operation, or any combination thereof.
 8. The method of claim 1, wherein the DVM initiator is coupled to a separate clock domain and a separate power domain from a clock domain and a power domain of the DVM network.
 9. The method of claim 1, wherein the plurality of DVM targets are coupled to clock domains and power domains separate from a clock domain and a power domain of the DVM initiator and a clock domain and a power domain of the DVM network.
 10. The method of claim 1, wherein each of the plurality of DVM targets is coupled to a separate clock domain and a separate power domain from remaining ones of the plurality of DVM targets.
 11. The method of claim 1, wherein the plurality of DVM targets is coupled to a single clock domain and a power domain.
 12. An apparatus for full-hardware management of power and clock domains related to a distributed virtual memory (DVM) network, comprising: a DVM initiator of a processor-based system of a device; a plurality of DVM targets physically coupled to the processor-based system of the device; a DVM network physically coupled to the DVM initiator and the plurality of DVM targets, wherein the plurality of DVM targets comprises a plurality of memory management units, wherein the DVM network is included in a system bus of the processor-based system of the device between the DVM initiator and the plurality of DVM targets, wherein the DVM network is configured to broadcast the same DVM operation from the DVM initiator to each of the plurality of DVM targets, and wherein the DVM network combines responses to the DVM operation received from the plurality of DVM targets into a single response for the DVM initiator, wherein, based on a DVM operation in the DVM network being broadcasted to the plurality of DVM targets: a clock domain coupled to the DVM network or a DVM target of the plurality of DVM targets that is a target of the DVM operation is turned on, a frequency of the clock domain coupled to the DVM network or the DVM target of the plurality of DVM targets that is the target of the DVM operation is increased, the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation is terminated based on the DVM target being turned off, or any combination thereof.
 13. The apparatus of claim 12, wherein the plurality of memory management units each comprise a translation lookaside buffer (TLB).
 14. The apparatus of claim 12, wherein the DVM initiator comprises a processor.
 15. The apparatus of claim 12, wherein the DVM operations comprise TLB invalidate operations, synchronization operations, or any combination thereof.
 16. The apparatus of claim 12, wherein the DVM initiator is coupled to a separate clock domain and a separate power domain from a clock domain and a power domain of the DVM network.
 17. The apparatus of claim 12, wherein the plurality of DVM targets are coupled to clock domains and power domains separate from a clock domain and a power domain of the DVM initiator and a clock domain and a power domain of the DVM network.
 18. The apparatus of claim 12, wherein each of the plurality of DVM targets is coupled to a separate clock domain and a separate power domain from remaining ones of the plurality of DVM targets.
 19. The apparatus of claim 12, wherein the plurality of DVM targets is coupled to a single clock domain and a power domain.
 20. The apparatus of claim 12, wherein the DVM network reports the single response to the DVM initiator.
 21. The apparatus of claim 12, wherein based on the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation being terminated, the DVM network responds to the DVM initiator on behalf of the DVM target.
 22. An apparatus for full-hardware management of power and clock domains related to a distributed virtual memory (DVM) network, comprising: means for broadcasting of a processor-based system of a device communicatively coupled to a plurality of DVM targets coupled to the processor-based system of the device; means for transmitting of the processor-based system of the device, to the means for broadcasting, a DVM operation, wherein the plurality of DVM targets comprises a plurality of memory management units, wherein the means for broadcasting is included in a system bus of the processor-based system of the device between the means for transmitting and the plurality of DVM targets, wherein the means for broadcasting is configured to broadcast, to each of the plurality of DVM targets, the same DVM operation, and wherein the means for broadcasting is configured to combine responses to the DVM operation received from the plurality of DVM targets into a single response for the means for transmitting; and means for performing, based on the DVM operation being broadcasted to the plurality of DVM targets by the DVM network, one or more hardware functions comprising: turn on a clock domain coupled to the DVM network or a DVM target of the plurality of DVM targets that is a target of the DVM operation, increase a frequency of the clock domain coupled to the DVM network or the DVM target of the plurality of DVM targets that is the target of the DVM operation, terminate the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation based on the DVM target being turned off, or any combination thereof.
 23. The apparatus of claim 22, wherein the plurality of memory management units each comprise a translation lookaside buffer (TLB).
 24. The apparatus of claim 22, wherein the DVM operations comprise TLB invalidate operations, synchronization operations, or any combination thereof.
 25. The apparatus of claim 22, wherein the means for transmitting is coupled to a separate clock domain and a separate power domain from a clock domain and a power domain of the DVM network.
 26. The apparatus of claim 22, wherein the plurality of DVM targets are coupled to clock domains and power domains separate from a clock domain and a power domain of the means for transmitting and a clock domain and a power domain of the DVM network.
 27. The apparatus of claim 22, wherein each of the plurality of DVM targets is coupled to a separate clock domain and a separate power domain from remaining ones of the plurality of DVM targets.
 28. The apparatus of claim 22, wherein the plurality of DVM targets is coupled to a single clock domain and a power domain.
 29. The apparatus of claim 22, wherein the DVM network reports the single response to the means for transmitting.
 30. The apparatus of claim 22, wherein based on the DVM operation to the DVM target of the plurality of DVM targets that is the target of the DVM operation being terminated, the DVM network responds to the means for transmitting on behalf of the DVM target.